Structured Sparse Canonical Correlation Analysis
Authors
Abstract
In this paper, we propose to apply sparse canonical correlation analysis (sparse CCA) to an important genome-wide association study problem, eQTL mapping. Existing sparse CCA models do not incorporate structural information among variables, such as pathways of genes. This work extends sparse CCA so that it can exploit either pre-given or unknown group structure via a structured-sparsity-inducing penalty. Such a structured penalty poses new challenges for optimization techniques. To address these challenges, we specialize the excessive gap framework to develop a scalable primal-dual optimization algorithm with a fast rate of convergence. Empirical results show that the proposed optimization algorithm is more efficient than existing state-of-the-art methods. We also demonstrate the effectiveness of structured sparse CCA on both simulated and genetic datasets.
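To make the idea of a sparsity-inducing penalty in CCA concrete, here is a minimal sketch in Python with NumPy. It is not the excessive-gap primal-dual algorithm proposed in the paper. It swaps in a much simpler alternating soft-thresholding scheme and assumes identity within-view covariances, non-overlapping groups, and hand-picked penalty levels. All function and parameter names (`sparse_cca`, `lam_u`, `lam_v`, `groups_v`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    # Proximal operator of the L1 penalty: shrink each entry toward zero.
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def group_soft_threshold(a, groups, lam):
    # Proximal operator of a NON-overlapping group-lasso penalty (assumption):
    # each group is shrunk as a block and dropped if its norm is below lam.
    out = np.zeros_like(a)
    for idx in groups:
        norm = np.linalg.norm(a[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * a[idx]
    return out

def normalize(v):
    # Rescale to unit length; leave an all-zero vector unchanged.
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def sparse_cca(X, Y, lam_u, lam_v, groups_v=None, n_iter=50):
    # Illustrative alternating scheme, assuming identity within-view covariance.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    C = X.T @ Y / X.shape[0]          # empirical cross-covariance
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    u, v = U[:, 0], Vt[0]             # initialize from leading singular vectors
    for _ in range(n_iter):
        u = normalize(soft_threshold(C @ v, lam_u))
        a = C.T @ u
        if groups_v is None:
            v = normalize(soft_threshold(a, lam_v))
        else:
            v = normalize(group_soft_threshold(a, groups_v, lam_v))
    return u, v

# Synthetic example: one latent factor drives X[:, :2] and Y[:, :2].
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = rng.normal(size=(n, 10))
Y = rng.normal(size=(n, 8))
X[:, :2] = z[:, None] + 0.1 * rng.normal(size=(n, 2))
Y[:, :2] = z[:, None] + 0.1 * rng.normal(size=(n, 2))
groups = [[0, 1], [2, 3], [4, 5], [6, 7]]   # hypothetical "pathways"
u, v = sparse_cca(X, Y, lam_u=0.5, lam_v=0.5, groups_v=groups)
```

In this toy setting the group penalty keeps or discards the variables of each group together, which is the behaviour the structured penalty is meant to encourage when groups correspond to gene pathways. Overlapping groups and unknown group structure, which the abstract mentions, need the more involved optimization that the paper develops.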
Similar resources
An Efficient Optimization Algorithm for Structured Sparse CCA, with Applications to eQTL Mapping
In this paper we develop an efficient optimization algorithm for solving canonical correlation analysis (CCA) with complex structured-sparsity-inducing penalties, including overlapping-group-lasso penalty and network-based fusion penalty. We apply the proposed algorithm to an important genome-wide association study problem, eQTL mapping. We show that, with the efficient optimization algorithm, ...
Structured sparse canonical correlation analysis for brain imaging genetics: an improved GraphNet method
Motivation: Structured sparse canonical correlation analysis (SCCA) models have been used to identify imaging genetic associations. These models use either group lasso or graph-guided fused lasso to conduct feature selection and feature grouping simultaneously. The group lasso based methods require prior knowledge to define the groups, which limits the capability when prior knowledge is incomple...
The RGCCA package for Regularized/Sparse Generalized Canonical Correlation Analysis
Contents: 2 Multiblock data analysis with the RGCCA package; 2.1 Regularized Generalized Canonical Correlation Analysis; 2.2 Variable selection in RGCCA: SGCCA; 2.3 Higher stage block components; 2.4 Implementatio...
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, categorizing different rice types and determining their quality is very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. In this paper, the problem of rice classification and quality detection is addressed using model learning concepts includ...
Sparse Kernel Canonical Correlation Analysis
We review the recently proposed method of Relevance Vector Machines which is a supervised training method related to Support Vector Machines. We also review the statistical technique of Canonical Correlation Analysis and its implementation in a Feature Space. We show how the technique of Relevance Vectors may be applied to the method of Kernel Canonical Correlation Analysis to gain a very spars...